The present invention provides a novel β1→6 N-acetylglucosaminyltransferase, which forms core 2
oligosaccharide structures in O-glycans, and a novel acceptor molecule, leukosialin, CD43, for core 2 β1→6
N-acetylglucosaminyltransferase activity. The amino acid sequences and nucleic acid sequences encoding these
molecules, as well as active fragments thereof, also are disclosed. A method for isolating nucleic acid sequences
encoding proteins having enzymatic activity is disclosed, using CHO cells that support replication of plasmid
vectors having a polyoma virus origin of replication. A method to obtain a suitable cell line that expresses an
acceptor molecule also is disclosed.
This work was supported by grants CA33000 and CA33895 awarded by the National Cancer Institute.
The United States Government has certain rights in this invention.

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION 



This invention relates generally to the fields of biochemistry and molecular biology and more specifically
to a novel human enzyme, UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc) β1→6
N-acetylglucosaminyltransferase (core 2 β1→6 N-acetylglucosaminyltransferase; C2GnT), and to a novel acceptor
molecule, leukosialin, CD43, for core 2 β1→6 N-acetylglucosaminyltransferase action. The invention
additionally relates to DNA sequences encoding core 2 β1→6 N-acetylglucosaminyltransferase and leukosialin,
to vectors containing a C2GnT DNA sequence or a leukosialin DNA sequence, to recombinant host cells
transformed with such vectors and to a method of transient expression cloning in CHO cells for identifying
and isolating DNA sequences encoding specific proteins, using CHO cells expressing a suitable acceptor
molecule.



BACKGROUND INFORMATION 

Most O-glycosidic oligosaccharides in mammalian glycoproteins are linked via N-acetylgalactosamine to
the hydroxyl groups of serine or threonine. These O-glycans can be classified into 4 different groups
depending on the nature of the core portion of the oligosaccharides (see Fig. 1). Although less well studied
than N-glycans, O-glycans likely have important biological functions. Indeed, the presence of O-linked
oligosaccharides with the core 2 branch, Galβ1→3(GlcNAcβ1→6)GalNAc, has been demonstrated in many
biological processes.

Piller et al., J. Biol. Chem. 263:15146-15150 (1988) reported that human T-cell activation is associated
with the conversion of core 1-based tetrasaccharides to core 2-based hexasaccharides on leukosialin, a
major sialoglycoprotein present on human T lymphocytes (see also Fig. 1). A similar increase in
hexasaccharides was observed in peripheral blood lymphocytes of patients suffering from T-cell leukemias (Saitoh
et al., Blood 77:1491-1499 (1991)), myelogenous leukemias (Brockhausen et al., Cancer Res. 51:1257-1263
(1991)) and immunodeficiency due to AIDS and the Wiskott-Aldrich syndrome (Piller et al., J. Exp. Med.
173:1501-1510 (1991)). In these patients' lymphocytes, changes in the amount of hexasaccharides were
caused by increased activity of either UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc) 6-β-D-N-acetylglucosaminyltransferase
(EC 2.4.1.102) or core 2 β1→6 N-acetylglucosaminyltransferase (Williams et al., J.
Biol. Chem. 255:11253-11261 (1980)). Increased activity of core 2 β1→6 N-acetylglucosaminyltransferase
also was observed in metastatic murine tumor cell lines as compared to their parental, non-metastatic
counterparts (Yousefi et al., J. Biol. Chem. 266:1772-1782 (1991)).

Increased complexity of the attached oligosaccharides increases the molecular weight of the
glycoprotein. For example, leukosialin containing hexasaccharides has a molecular weight of ~135kDa,
whereas leukosialin containing tetrasaccharides has a molecular weight of ~105kDa (Carlsson et al., J. Biol.
Chem. 261:12779-12786 and 12787-12795 (1986)).

Fox et al., J. Immunol. 131:762-767 (1983) raised a monoclonal antibody, T305, against human
T-lymphocytic leukemia cells. Sportsman et al., J. Immunol. 135:158-164 (1985) reported T305 binding was
abolished by neuraminidase treatment, suggesting T305 binds to hexasaccharides. T305 specifically reacts
with the high molecular weight form of leukosialin (Saitoh et al., supra, (1991)).

Previous studies indicated poly-N-acetyllactosamine repeats extend almost exclusively from the branch
formed by the core 2 β1→6 N-acetylglucosaminyltransferase (Fukuda et al., J. Biol. Chem. 261:12796-12806
(1986)). Consistent with these results, Yousefi et al., supra, (1991) demonstrated that the core 2
enzyme in metastatic tumor cells regulates the level of poly-N-acetyllactosamine synthesis in O-linked
oligosaccharides.

Poly-N-acetyllactosamines are subject to a variety of modifications, including the formation of the sialyl
Le^x, NeuNAcα2→3Galβ1→4(Fucα1→3)GlcNAc-, or the sialyl Le^a, NeuNAcα2→3Galβ1→3(Fucα1→4)GlcNAc-,
determinants (Fukuda, Biochim. Biophys. Acta 780:119-150 (1985)). Such modifications are significant
because these determinants, which are present on neutrophils and monocytes, serve as ligands for E- and
P-selectin present on endothelial cells and platelets, respectively (see, for example, Larsen et al., Cell
63:467-474 (1990)).

In addition, tumor cells often express a significant amount of sialyl Le^x and/or sialyl Le^a on their cell
surfaces. The interaction between E-selectin or P-selectin and these cell surface carbohydrates may play a
role in tumor cell adhesion to endothelium during the metastatic process (Walz et al., supra, (1990)). Kojima
et al., Biochem. Biophys. Res. Commun. 182:1288-1295 (1992) reported that selectin-dependent tumor cell
adhesion to endothelial cells was abolished by blocking O-glycan synthesis. Complex sulfated O-glycans
also may serve as ligands for the lymphocyte homing receptor, L-selectin (Imai et al., J. Cell Biol. 113:1213-1221
(1991)).

These reported observations establish core 2 β1→6 N-acetylglucosaminyltransferase as a critical
enzyme in O-glycan biosynthesis. The availability of core 2 β1→6 N-acetylglucosaminyltransferase will
allow the in vivo and in vitro production of specific glycoproteins having core 2 oligosaccharides and
subsequent study of the effect of these variant O-glycans on cell-cell interactions. For example, core 2 β1→6
N-acetylglucosaminyltransferase is a useful marker for transformed or cancerous cells. An understanding of
the role of core 2 β1→6 N-acetylglucosaminyltransferase in transformed and cancerous cells may elucidate
a mechanism for the aberrant cell-cell interactions observed in these cells. In order to understand the
control of expression of these oligosaccharides and their function, isolation of a cDNA clone for core 2
β1→6 N-acetylglucosaminyltransferase is a prerequisite. However, the DNA sequence encoding core 2
β1→6 N-acetylglucosaminyltransferase has not yet been reported.

Thus, a need exists for identifying the core 2 β1→6 N-acetylglucosaminyltransferase and the DNA
sequences encoding this enzyme. The present invention satisfies this need and provides related
advantages as well.

SUMMARY OF THE INVENTION

The present invention generally relates to a novel purified human β1→6 N-acetylglucosaminyltransferase.
A cDNA sequence encoding a 428 amino acid protein having β1→6 N-acetylglucosaminyltransferase
activity also is provided. The purified human β1→6 N-acetylglucosaminyltransferase, or an active fragment
thereof, catalyzes the formation of critical branches in O-glycans.

The invention further relates to a novel purified acceptor molecule, leukosialin, CD43, for core 2 β1→6
N-acetylglucosaminyltransferase activity. The leukosialin cDNA encodes a novel variant leukosialin, which
is created by alternative splicing of the genomic leukosialin DNA sequence.

Isolated nucleic acids encoding either core 2 β1→6 N-acetylglucosaminyltransferase or leukosialin are
disclosed, as are vectors containing the nucleic acids and recombinant host cells transformed with such
vectors. The invention further provides methods of detecting such nucleic acids by contacting a sample with
a nucleic acid probe having a nucleotide sequence capable of hybridizing with the isolated nucleic acids of
the present invention. The core 2 β1→6 N-acetylglucosaminyltransferase and leukosialin amino acid and
nucleic acid sequences disclosed herein can be purified from human cells or produced using well known
methods of recombinant DNA technology.

The invention also discloses a method of isolating nucleic acid sequences encoding proteins that have
an enzymatic activity. Such a nucleic acid sequence is obtained by transfecting the nucleic acid, which is
contained within a vector having a polyoma virus replication origin, into a Chinese hamster ovary (CHO) cell
line simultaneously expressing polyoma virus large T antigen and the acceptor molecule for the protein
having an enzymatic activity.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structures and biosynthesis of O-glycans. Structures of O-glycan cores can be
classified into 4 groups (core 1 to core 4), each of which is synthesized starting with GalNAcα1→Ser/Thr.
The core 1 structure is synthesized by the addition of a β1→3 Gal residue to the GalNAc residue. The core
1 structure can be converted to core 2 by the addition of a β1→6 N-acetylglucosaminyl residue. This
intermediate is usually converted to the hexasaccharide by sequential addition of galactose and sialic acid
residues (bottom right). The core 2 β1→6 N-acetylglucosaminyltransferase and the linkage formed by the
enzyme are indicated by a box. In certain cell types, the core 2 structure can be extended by the addition
of N-acetyllactosamine (Galβ1→4GlcNAcβ1→3) repeats to form poly-N-acetyllactosamine. In the absence of
core 2 β1→6 N-acetylglucosaminyltransferase, core 1 is converted to the monosialoform, then to the
disialoform by sequential addition of α2→3- and α2→6-linked sialic acid residues (bottom left). Alternatively,
core 3 can be synthesized by the addition of a β1→3 N-acetylglucosaminyl residue to the GalNAc residue.
Core 3 can be converted to core 4 by another β1→6 N-acetylglucosaminyltransferase (top of figure).

Figure 2 depicts genomic DNA sequence (SEQ. ID. NO. 1) and cDNA sequence (SEQ. ID. NO. 1) of
leukosialin. The genomic sequence is numbered relative to the transcriptional start site. Exon 1 and exon 2
have been previously described. Exon 1' is newly identified here. In the isolated cDNA, exon 1' is
immediately followed by the exon 2 sequence. Deduced amino acids (SEQ. ID. NO. 2) are presented under
the coding sequence, which begins in exon 2. A portion of the exon 2 sequence is shown.

Figure 3 establishes the ability of pGT/hCG to replicate in CHO cell lines expressing polyoma large T
antigen and leukosialin. In panel A, six clonal CHO cell lines were examined for replication of pcDNAI-based
pGT/hCG (lanes 1-6). In panel B, replication in cell clone 5 (CHO-Py-leu) was further examined by
treatment with increasing concentrations of DpnI and XhoI (lanes 2 and 3). Plasmid DNA isolated from
MOP-6 cells was used as a control (lane 1). Plasmid DNA was extracted using the Hirt procedure and
samples were digested with XhoI and DpnI. In parallel, pGT/hCG plasmid purified from E. coli MC1061/P3
was digested with XhoI and DpnI (lane 7 in panel A and lane 4 in panel B) or XhoI alone (lane 8 in panel A
and lane 5 in panel B). The arrow indicates the migration of plasmid DNA resistant to DpnI digestion. The
arrowheads indicate plasmid DNA digested by DpnI.

Figure 4 shows the expression of T305 antigen directed by pcDNAI-C2GnT. Subconfluent CHO-Py-leu
cells were transfected with pcDNAI-C2GnT (panels A and B) or mock-transfected with pcDNAI (panels C
and D). Sixty-four hours after transfection, the cells were fixed, then incubated with mouse T305 monoclonal
antibody followed by fluorescein isothiocyanate-conjugated sheep anti-mouse IgG (panels A, B and C). Two
different areas are shown in panels A and B. Panel D shows a phase micrograph of the same field shown in
panel C. Bar = 20 µm.

Figure 5 depicts the cDNA sequence (SEQ. ID. NO. 3) and translated amino acid sequence (SEQ. ID.
NO. 4) of core 2 β1→6 N-acetylglucosaminyltransferase. The open reading frame and full-length nucleotide
sequence of C2GnT are shown. The signal/membrane-anchoring domain is doubly underlined. The
polyadenylation signal is boxed. Potential N-glycosylation sites are marked with asterisks. The sequences
are numbered relative to the translation start site.

Figure 6 shows the expression of core 2 β1→6 N-acetylglucosaminyltransferase mRNA in various cell
types. Poly(A)+ RNA (11 µg) from CHO-Py-leu cells (lane 1), HL-60 promyelocytes (lane 2), K562
erythrocytic cells (lane 3), and SP and L4 colonic carcinoma cells (lanes 4 and 5) was resolved by
electrophoresis. RNA was transferred to a nylon membrane and hybridized with a radiolabeled fragment of
pPROTA-C2GnT. Migration of RNA size markers is indicated.

Figure 7 illustrates the construction of the vector encoding the protein A-C2GnT fusion protein. The
cDNA sequence corresponding to Pro» to His428 was fused in frame with the IgG binding domain of S.
aureus protein A (bottom; SEQ. ID. NOS. 7 and 8). The sequence includes the cleavable signal peptide,
which allows secretion of the fused protein. The coding sequence is under control of the SV40 promoter.
The remainder of the vector sequence shown was derived from rabbit β-globin gene sequences, including
an intervening sequence (IVS) and a polyadenylation signal (An).

DETAILED DESCRIPTION OF THE INVENTION

The present invention generally relates to a novel human core 2 β1→6 N-acetylglucosaminyltransferase.
The invention further relates to a novel method of transient expression cloning in CHO cells that was used
to isolate the cDNA sequence encoding human core 2 β1→6 N-acetylglucosaminyltransferase (C2GnT). The
invention also relates to a novel human leukosialin, which is an acceptor molecule for core 2 β1→6
N-acetylglucosaminyltransferase activity.

Cells generally contain extremely low amounts of glycosyltransferases. As a result, cDNA cloning based
on screening using an antibody or a probe based on the glycosyltransferase amino acid sequence has met
with limited success. However, isolation of cDNAs encoding various glycosyltransferases can be achieved
by transient expression of cDNA in recipient cells.

Successful application of the transient expression cloning method to isolate a cDNA sequence encoding
a glycosyltransferase requires an appropriate recipient cell line. Ideal recipient cells should not express the
glycosyltransferase of interest. As a result, the recipient cells would normally lack the oligosaccharide
structure formed by such a glycosyltransferase.

Expression of the cloned glycosyltransferase cDNA in the recipient cell line should result in formation of
the specific oligosaccharide structure. The resultant oligosaccharide can be identified using a specific
antibody or lectin that recognizes the structure. The recipient cell line also must support replication of an
appropriate plasmid vector.

COS-1 cells initially appear to satisfy the requirements for using the transient expression method.
COS-1 cells express SV40 large T antigen and support the replication of plasmid vectors harboring a SV40
replication origin (Gluzman et al., Cell 23:175-182 (1981)). Although COS-1 cells, themselves, express a
variety of glycosyltransferases, COS-1 cells have been used to clone cDNA sequences encoding human
blood group Lewis α1→3/4 fucosyltransferase and murine α1→3 galactosyltransferase (Kukowska-Latallo et
al., Genes and Devel. 4:1283-1303 (1990); Larsen et al., Proc. Natl. Acad. Sci. USA 86:8227-8231 (1989)).
Also, Goelz et al., Cell 63:175-182 (1990), utilized an antibody that inhibits E-selectin mediated adhesion to
isolate a cDNA sequence encoding an α1→3 fucosyltransferase.

An attempt was made to use COS-1 cells to isolate cDNA clones encoding core 2 β1→6
N-acetylglucosaminyltransferase. COS-1 cells were transfected using cDNA obtained from activated human T
cells, which express the core 2 β1→6 N-acetylglucosaminyltransferase. Transfected cells suspected of
expressing core 2 β1→6 N-acetylglucosaminyltransferase were identified by the
presence of increased levels of the core 2 oligosaccharide structure formed by core 2 β1→6
N-acetylglucosaminyltransferase activity. The presence of the core 2 structure was identified using the
monoclonal antibody, T305, which identifies a hexasaccharide on leukosialin. A clone expressing high levels
of the T305 antigen was isolated and sequenced.

Surprisingly, transfection using COS-1 cells resulted in the isolation of a cDNA clone encoding a novel
variant of human leukosialin, which is the acceptor molecule for core 2 β1→6 N-acetylglucosaminyltransferase
activity. Examination of the cDNA sequence of the newly isolated leukosialin revealed the cDNA
sequence was formed as a result of alternative splicing of exons in the genomic leukosialin DNA sequence.
Specifically, the newly isolated leukosialin is encoded by a cDNA sequence containing a previously
undescribed non-coding exon at the 5'-terminus (exon 1' in Figure 2; SEQ. ID. NO. 1).

The unexpected result obtained using COS-1 cells led to the development of a new transfection system
to isolate a cDNA sequence encoding core 2 β1→6 N-acetylglucosaminyltransferase. CHO cells, which do
not normally express the T305 antigen, were transfected with DNA sequences encoding human leukosialin
and the polyoma virus large T antigen. A cell line, designated CHO-Py-leu, which expresses human
leukosialin and polyoma virus large T antigen, was isolated.

CHO-Py-leu cells were used for transient expression cloning of a cDNA sequence encoding core 2
β1→6 N-acetylglucosaminyltransferase. CHO-Py-leu cells were transfected with cDNA obtained from human
HL-60 promyelocytes. A plasmid, pcDNAI-C2GnT, which directed expression of the T305 antigen, was
isolated and the cDNA insert was sequenced (see Figure 5; SEQ. ID. NO. 3). The 2105 base pair cDNA
sequence encodes a putative 428 amino acid protein (SEQ. ID. NO. 4). The genomic DNA sequence
encoding core 2 β1→6 N-acetylglucosaminyltransferase can be isolated using methods well known to those
skilled in the art, such as nucleic acid hybridization using the core 2 β1→6 N-acetylglucosaminyltransferase
cDNA disclosed herein to screen, for example, a genomic library prepared from HL-60 promyelocytes.

An enzyme similar to the disclosed human core 2 β1→6 N-acetylglucosaminyltransferase has been
purified from bovine tracheal epithelium (Ropp et al., J. Biol. Chem. 266:23863-23871 (1991), which is
incorporated herein by reference). The apparent molecular weight of the bovine enzyme is ~69kDa. In
comparison, the predicted molecular weight of the polypeptide portion of core 2 β1→6
N-acetylglucosaminyltransferase is ~50kDa. The deduced amino acid sequence of core 2 β1→6
N-acetylglucosaminyltransferase reveals two to three potential N-glycosylation sites, suggesting N-glycosylation and
O-glycosylation, or other post-translational modification, could account for the larger apparent size of the
bovine enzyme.
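
The two quantities discussed above, an approximate polypeptide mass and the number of potential N-glycosylation sites, can be illustrated with a minimal Python sketch. It rests on assumptions not taken from this disclosure: an average residue mass of about 110 Da (a common rule of thumb) and the usual Asn-X-Ser/Thr sequon definition with X other than proline. The function names and the toy peptide are hypothetical, and the toy peptide is not SEQ. ID. NO. 4.

```python
import re
from typing import List

# Assumption: rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_polypeptide_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from residue count alone (no glycans)."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def find_n_glycosylation_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn in N-X-S/T sequons, where X is not Pro.

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein.upper())]


if __name__ == "__main__":
    # 428 residues is the length stated for the putative C2GnT protein.
    print("approx. mass (kDa):", estimate_polypeptide_kda(428))
    # Hypothetical toy peptide used only to exercise the sequon scan.
    print("sequon positions:", find_n_glycosylation_sequons("MNASKLNPTQNGTV"))
```

With the rule-of-thumb residue mass, 428 residues give roughly 47 kDa, in the same range as the ~50kDa predicted from the actual sequence. Any sites found by such a scan are only potential sites; whether they are glycosylated has to be determined experimentally.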

Expression of the cloned C2GnT sequence, or a fragment thereof, directed formation of the specific
O-glycan core 2 oligosaccharide structure. Although several cDNA sequences encoding glycosyltransferases
have been isolated (Paulson and Colley, J. Biol. Chem. 264:17615-17618 (1989); Schachter, Curr. Opin.
Struct. Biol. 1:755-765 (1991), which are incorporated herein by reference), C2GnT is the first reported
cDNA sequence encoding an enzyme involved exclusively in O-glycan synthesis.

In O-glycans, β1→6 N-acetylglucosaminyl linkages may occur in both core 2, Galβ1→3(GlcNAcβ1→6)GalNAc,
and core 4, GlcNAcβ1→3(GlcNAcβ1→6)GalNAc, structures (Brockhausen et al., Biochemistry
24:1866-1874 (1985), which is incorporated herein by reference). In addition, β1→6 N-acetylglucosaminyl
linkages occur in the side chains of poly-N-acetyllactosamine, forming the I-structure (Piller et al., J. Biol.
Chem. 259:13385-13390 (1984), which is incorporated herein by reference), and in the side chain attached
to α-mannose of the N-glycan core structure, forming a tetraantennary saccharide (Cummings et al., J. Biol.
Chem. 257:13421-13427 (1982), which is incorporated herein by reference). The enzymes responsible for
these linkages all share the unique property that Mn2+ is not required for their activity.

Although it was originally suggested that these β1→6 N-acetylglucosaminyl linkages were formed by
the same enzyme (Piller et al., 1984), the present disclosure clearly demonstrates that the HL-60-derived
core 2 β1→6 N-acetylglucosaminyltransferase is specific for the formation only of O-glycan core 2. This
result is consistent with a recent report demonstrating that myeloid cell lysates contain the enzymatic
activity associated with core 2, but not core 4, formation (Brockhausen et al., supra, (1991)).

Analysis of mRNA isolated from colonic cancer cells indicated core 2 β1→6 N-acetylglucosaminyltransferase
is expressed in these cells. Recent studies using affinity absorption suggested at least two different
β1→6 N-acetylglucosaminyltransferases were present in tracheal epithelium (Ropp et al., supra, (1991)).
One of these transferases formed core 2, core 4, and I structures. Thus, at least one other β1→6
N-acetylglucosaminyltransferase present in epithelial cells can form core 2, core 4 and I structures. Similarly,
a β1→6 N-acetylglucosaminyltransferase present in Novikoff hepatoma cells can form both core 2 and I
structures (Koenderman et al., Eur. J. Biochem. 166:199-208 (1987), which is incorporated herein by
reference).

The acceptor molecule specificity of core 2 β1→6 N-acetylglucosaminyltransferase is different from the
specificity of the enzymes present in tracheal epithelium and Novikoff hepatoma cells. Thus, a family of
β1→6 N-acetylglucosaminyltransferases can exist, the members of which differ in acceptor specificity but
are capable of forming the same linkage. Members of this family are isolated from cells expressing β1→6
N-acetylglucosaminyltransferase activity, using, for example, nucleic acid hybridization assays and studies
of acceptor molecule specificity. Such a family was reported for the α1→3 fucosyltransferases (Weston et
al., J. Biol. Chem. 267:4152-4160 (1992), which is incorporated herein by reference).

The formation of the core 2 structure is critical to cell structure and function. For example, the core 2
structure is essential for elongation of poly-N-acetyllactosamine and for formation of sialyl Le^x or sialyl Le^a
structures. Furthermore, the biosynthesis of cartilage keratan sulfate may be initiated by the core 2 β1→6
N-acetylglucosaminyltransferase, since the keratan sulfate chain is extended from a branch present in core
2 structure in the same way as poly-N-acetyllactosamine (Dickenson et al., Biochem. J. 269:55-59 (1990),
which is incorporated herein by reference). Keratan sulfate is absent in wild-type CHO cells, which do not
express the core 2 β1→6 N-acetylglucosaminyltransferase (Esko et al., J. Biol. Chem. 261:15725-15733
(1986), which is incorporated herein by reference). These structures are believed to be important for cellular
recognition and matrix formation. The availability of the cDNA clone encoding the core 2 β1→6
N-acetylglucosaminyltransferase will aid in understanding how the various carbohydrate structures are formed
during differentiation and malignancy. Manipulation of the expression of the various carbohydrate structures
by gene transfer and gene inactivation methods will help elucidate the various functions of these structures.

The present invention is directed to a method for transient expression cloning in CHO cells of cDNA
sequences encoding proteins having enzymatic activity. Isolation of human core 2 β1→6
N-acetylglucosaminyltransferase is provided as an example of the disclosed method. However, the method can be
used to obtain cDNA sequences encoding other proteins having enzymatic activity.

For example, lectins and antibodies reactive with other specific oligosaccharide structures are available
and can be used to screen for glycosyltransferase activity. Also, CHO cell lines that have defects in
glycosylation have been isolated. These cell lines can be used to study the activity of the corresponding
glycosyltransferase (Stanley, Ann. Rev. Genet. 18:525-552 (1984), which is incorporated herein by
reference). CHO cell lines also have been selected for various defects in cellular metabolism, loss of expression
of cell surface molecules and resistance to cytotoxic drugs (see, for example, Malmstrom and Krieger, J.
Biol. Chem. 266:24025-24030 (1991); Yayon et al., Cell 64:841-848 (1991), which are incorporated herein by
reference). The approach disclosed herein should allow isolation of cDNA sequences encoding the proteins
involved in these various cellular functions.

As used herein, the terms "purified" and "isolated" mean that the molecule or compound is
substantially free of contaminants normally associated with a native or natural environment. For example, a purified
protein can be obtained from a number of methods. The naturally-occurring protein can be purified by any
means known in the art, including, for example, by affinity purification with antibodies having specific
reactivity with the protein. In this regard, anti-core 2 β1→6 N-acetylglucosaminyltransferase antibodies can
be used to substantially purify naturally-occurring core 2 β1→6 N-acetylglucosaminyltransferase from
human HL-60 promyelocytes.

Alternatively, a purified protein of the present invention can be obtained by well known recombinant
methods, utilizing the nucleic acids disclosed herein, as described, for example, in Sambrook et al.,
Molecular Cloning: A Laboratory Manual 2d ed. (Cold Spring Harbor Laboratory 1989), which is incorporated
herein by reference, and by the methods described in the Examples below. Furthermore, purified proteins
can be synthesized by methods well known in the art.

As used herein, the phrase "substantially the sequence" includes the described nucleotide or amino
acid sequence and sequences having one or more additions, deletions or substitutions that do not
substantially affect the ability of the sequence to encode a protein having a desired functional activity. In
addition, the phrase encompasses any additional sequence that hybridizes to the disclosed sequence under
stringent hybridization conditions. Methods of hybridization are well known to those skilled in the art. For
example, sequence modifications that do not substantially alter such activity are intended. Thus, a protein
having substantially the amino acid sequence of Figure 5 (SEQ. ID. NO. 4) refers to core 2 β1→6
N-acetylglucosaminyltransferase encoded by the cDNA described in Example IV, as well as proteins having
amino acid sequences that are modified but, nevertheless, retain the functions of core 2 β1→6
N-acetylglucosaminyltransferase. One skilled in the art can readily determine such retention of function
following the guidance set forth, for example, in Examples V and VI.

The present invention is further directed to active fragments of the human core 2 β1→6
N-acetylglucosaminyltransferase protein. As used herein, an active fragment refers to portions of the protein
that substantially retain the glycosyltransferase activity of the intact core 2 β1→6
N-acetylglucosaminyltransferase protein. One skilled in the art can readily identify active fragments of proteins such as core 2 β1→6
N-acetylglucosaminyltransferase by comparing the activities of a selected fragment with the intact protein
following the guidance set forth in the Examples below.

As used herein, the term "glycosyltransferase activity" refers to the function of a glycosyltransferase to
link sugar residues together through a glycoside bond to create critical branches in oligosaccharides.
Glycosyltransferase activity results in the specific transfer of a monosaccharide to an appropriate acceptor
molecule, such that the acceptor molecule contains oligosaccharides having critical branches. One skilled in
the art would understand the terms "enzymatic activity" and "catalytic activity" to generally refer to a
function of certain proteins, such as the function of those proteins having glycosyltransferase activity.

As used herein, the term "acceptor molecule" refers to a molecule that is acted upon by a protein
having enzymatic activity. For example, an acceptor molecule, such as leukosialin, as identified by the
amino acid sequence of Figure 2 (SEQ. ID. NO. 2), accepts the transfer of a monosaccharide due to
glycosyltransferase activity. An acceptor molecule, such as leukosialin, may already contain one or more
sugar residues. The transfer of monosaccharides to an acceptor molecule, such as leukosialin, results in the
formation of critical branches of oligosaccharides.

As used herein, the term "critical branches" refers to oligosaccharide structures formed by specific
glycosyltransferase activity. Critical branches may be involved in various cellular functions, such as cell-cell
recognition. The oligosaccharide structure of a critical branch can be determined using methods well known
in the art, such as the method for determining the core 2 oligosaccharide structure, as described in
Examples V and VI.

Relatedly, the invention also provides nucleic acids encoding the human core 2 β1→6
N-acetylglucosaminyltransferase protein and leukosialin protein described above. The nucleic acids can be in the
form of DNA, RNA or cDNA, such as the novel C2GnT cDNA of 2105 base pairs identified in Figure 5
(SEQ. ID. NO. 3) or the novel leukosialin cDNA identified in Figure 2 (SEQ. ID. NO. 1), for example. Such
nucleic acids can also be chemically synthesized by methods known in the art, including, for example, the
use of an automated nucleic acid synthesizer.

The nucleic acid can have substantially the nucleotide sequence of C2GnT, identified in Figure 5 (SEQ.
ID. NO. 3), or leukosialin, identified in Figure 2 (SEQ. ID. NO. 1). Portions of such nucleic acids that encode
active fragments of the core 2 β1→6 N-acetylglucosaminyltransferase protein or leukosialin protein of the
present invention also are contemplated.

Nucleic acid probes capable of hybridizing to the nucleic acids of the present invention under
reasonably stringent conditions can be prepared from the cloned sequences or by synthesizing
oligonucleotides by methods known in the art. The probes can be labeled with markers according to
methods known in the art and used to detect the nucleic acids of the present invention. Methods for
detecting such nucleic acids can be accomplished by contacting the probe with a sample containing or
suspected of containing the nucleic acid under hybridizing conditions, and detecting the hybridization of the
probe to the nucleic acid.
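
The base-pairing logic behind probe detection can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the detection method of the disclosure: it treats hybridization as perfect reverse complementarity, whereas real hybridization under stringent conditions depends on temperature, salt concentration and probe length and can tolerate some mismatches. The function names and the short sequences are hypothetical.

```python
from typing import List

# Watson-Crick complement table for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_binding_sites(probe: str, target_strand: str) -> List[int]:
    """Return 0-based start positions on target_strand where the probe would
    anneal with perfect complementarity (simplifying assumption)."""
    site = reverse_complement(probe)
    target = target_strand.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # step by 1 to allow overlaps
    return positions


if __name__ == "__main__":
    target = "TTGACCATGGAAGCTTAA"              # hypothetical sample strand
    probe = reverse_complement("ATGGAAGC")     # probe designed against an 8-mer
    print(probe, probe_binding_sites(probe, target))
```

An empty list corresponds to no detectable perfect-match site on the strand examined; in practice both strands of a sample would be considered.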

The present invention is further directed to vectors containing the nucleic acids described above. The
term "vector" includes vectors that are capable of expressing nucleic acid sequences operably linked to
regulatory sequences capable of effecting their expression. Numerous cloning vectors are known in the art.
Thus, the selection of an appropriate cloning vector is a matter of choice. In general, useful vectors for
recombinant DNA are often plasmids, which refer to circular double stranded DNA loops such as pcDNAI or
pcDSRα. As used herein, "plasmid" and "vector" may be used interchangeably as the plasmid is a
common form of a vector. However, the invention is intended to include other forms of expression vectors
that serve equivalent functions.

Suitable host cells containing the vectors of the present invention are also provided. Host cells can be
transformed with a vector and used to express the desired recombinant or fusion protein. Methods of
recombinant expression in a variety of host cells, such as mammalian, yeast, insect or bacterial cells, are
widely known. For example, a nucleic acid encoding core 2 β1→6 N-acetylglucosaminyltransferase or a
nucleic acid encoding leukosialin can be transfected into cells using the calcium phosphate technique or
other transfection methods, such as those described in Sambrook et al., supra, (1989).

Alternatively, nucleic acids can be introduced into cells by infection with a retrovirus carrying the gene
or genes of interest. For example, the gene can be cloned into a plasmid containing retroviral long terminal
repeat sequences, the C2GnT DNA sequence or the leukosialin DNA sequence, and an antibiotic resistance
gene for selection. The construct can then be transfected into a suitable cell line, such as PA12, which
carries a packaging deficient provirus and expresses the necessary components for virus production,
including synthesis of amphotropic glycoproteins. The supernatant from these cells contains infectious
virus, which can be used to infect the cells of interest.

Isolated recombinant polypeptides or proteins can be obtained by growing the described host cells
under conditions that favor transcription and translation of the transfected nucleic acid. Recombinant
proteins produced by the transfected host cells are isolated using methods set forth herein and by methods
well known to those skilled in the art.

Also provided are antibodies having specific reactivity with the core 2 β1→6 N-acetylglucosaminyltrans-
ferase protein or leukosialin protein of the present invention. Active fragments of antibodies, for example,
Fab and F(ab')2 fragments, having specific reactivity with such proteins are intended to fall within the
definition of an "antibody." Antibodies exhibiting a titer of at least about 1.5 x 10^5, as determined by ELISA,
are useful in the present invention.

The antibodies of the invention can be produced by any method known in the art. For example,
polyclonal and monoclonal antibodies can be produced by methods described in Harlow and Lane,
Antibodies: A Laboratory Manual (Cold Spring Harbor 1988), which is incorporated herein by reference. The
proteins, particularly core 2 β1→6 N-acetylglucosaminyltransferase or leukosialin of the present invention,
can be used as immunogens to generate such antibodies. Altered antibodies, such as chimeric, humanized,
CDR-grafted or bifunctional antibodies, can also be produced by methods well known to those skilled in the
art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods
described, for example, in Sambrook et al., supra, (1989).

The antibodies can be used for detecting or purifying the core 2 β1→6 N-
acetylglucosaminyltransferase protein or the leukosialin protein of the present invention. With respect to the
detection of such proteins, the antibodies can be used for in vitro or in vivo methods well known to those
skilled in the art.

Finally, kits useful for carrying out the methods of the invention are also provided. The kits can contain
a core 2 β1→6 N-acetylglucosaminyltransferase protein, antibody or nucleic acid of the present invention
and an ancillary reagent. Alternatively, the kit can contain a leukosialin protein, antibody or nucleic acid of
the present invention and an ancillary reagent. An ancillary reagent may include diagnostic agents, signal
detection systems, buffers, stabilizers, pharmaceutically acceptable carriers or other reagents and materials
conventionally included in such kits.

A cDNA sequence encoding core 2 β1→6 N-acetylglucosaminyltransferase was isolated and core 2
β1→6 N-acetylglucosaminyltransferase activity was determined. This is the first report of transient expres-
sion cloning using CHO cells expressing polyoma large T antigen. The following examples are intended to
illustrate but not limit the present invention.

EXAMPLE I

EXPRESSION CLONING IN COS-1 CELLS OF THE cDNA FOR THE PROTEIN CARRYING THE
HEXASACCHARIDES

COS-1 cells were transfected with a cDNA library, pcDSRα-2F1, constructed from poly(A)+ RNA of
activated T lymphocytes, which express the core 2 β1→6 N-acetylglucosaminyltransferase (Yokota et al.,
Proc. Natl. Acad. Sci. USA 83:5894-5898 (1986); Piller et al., supra, (1988), which are incorporated herein
by reference). COS-1 cells support replication of the pcDSRα constructs, which contain the SV40 replication
origin. Transfected cells were selected by panning using monoclonal antibody T305, which recognizes
sialylated branched hexasaccharides (Piller et al., supra, (1991); Saitoh et al., supra, (1991)). Methods
referred to in this example are described in greater detail in the examples that follow.

Following several rounds of transfection, one plasmid, pcDSRα-leu, directing high expression of the
T305 antigen was identified. The cloned cDNA insert was isolated and sequenced, then compared with
other reported sequences. The newly isolated cDNA sequence was nearly identical to the sequence
reported for leukosialin, except the 5'-flanking sequences were different (Pallant et al., Proc. Natl. Acad. Sci.
USA 86:1328-1332 (1989), which is incorporated herein by reference).

Comparison of the cloned cDNA sequence with the genomic leukosialin DNA sequence revealed that the
start site of the cDNA sequence is located 259 bp upstream of the transcription start site of the previously
reported sequence (Figure 2: compare exon 1' and exon 1) (Shelley et al., Biochem. J. 270:569-576 (1990);
Kudo and Fukuda, J. Biol. Chem. 266:8483-8489 (1991), which are incorporated herein by reference). A
consensus splice site was identified at the exon-intron junction of the newly identified 122 bp exon 1' in
pcDSRα-leu (Breathnach and Chambon, Ann. Rev. Biochem. 50:349-383 (1981), which is incorporated
herein by reference). This splice site is followed by the exon 2 sequence.

These results indicate that the T305 antibody preferentially binds to branched hexasaccharides attached to
leukosialin. Indeed, a small amount of the hexasaccharides (approximately 8% of the total) was detected in
O-glycans isolated from control COS-1 cells. T305 binding is similar to that of anti-M and anti-N antibodies, which
recognize both the glycan and polypeptide portions of the erythrocyte glycoprotein, glycophorin (Sadler et al.,
J. Biol. Chem. 254:2112-2119 (1979), which is incorporated herein by reference). These observations are
consistent with reports that only leukosialin strongly reacted with T305 in Western blots of leukocyte cell
extracts, even though leukocytes also express other glycoproteins, such as CD45, that must also contain
the same hexasaccharides (Piller et al., supra, (1991); Saitoh et al., supra, (1991)).

EXAMPLE II

ESTABLISHMENT OF CHO CELL LINES THAT STABLY EXPRESS POLYOMA VIRUS LARGE T
ANTIGEN AND LEUKOSIALIN

T305 preferentially binds to branched hexasaccharides attached to leukosialin. Such hexasaccharides
are not present on the erythropoietin glycoprotein produced in CHO cells, although the glycoprotein does
contain the precursor tetrasaccharide (Sasaki et al., J. Biol. Chem. 262:12059-12076 (1987), which is
incorporated herein by reference). T305 antigen also is not detectable in CHO cells transiently transfected
with pcDSRα-leu. In order to screen for the presence of a cDNA clone expressing core 2 β1→6 N-
acetylglucosaminyltransferase activity, a CHO cell line expressing both leukosialin and polyoma large T
antigen was established (see, for example, Heffernan and Dennis, Nucl. Acids Res. 19:85-92 (1991), which is
incorporated herein by reference).

Vectors: A plasmid vector, pPSVE1-PyE, which contains the polyoma virus early genes under the control
of the SV40 early promoter, was constructed using a modification of the method of Muller et al., Mol. Cell.
Biol. 4:2409-2412 (1984), which is incorporated herein by reference. Plasmid pPSVE1 was prepared using
pPSG4 (American Type Culture Collection 37337) and SV40 viral DNA (Bethesda Research Laboratories)
essentially as described by Featherstone et al., Nucl. Acids Res. 12:7235-7249 (1984), which is incor-
porated herein by reference. Following EcoRI and HincII digestion of plasmid pPyLT-1 (American Type
Culture Collection 41043), a DNA sequence containing the carboxy terminal coding region of polyoma virus
large T antigen was isolated. The HincII site was converted to an EcoRI site by blunt-end ligation of
phosphorylated EcoRI linkers (Stratagene). Plasmid pPSVE1-PyE was generated by inserting the carboxy-
terminal coding sequence for large T antigen into the unique EcoRI site of plasmid pPSVE1.

Plasmid pZIPNEO-leu was constructed by introducing the EcoRI fragment of PEER-3 cDNA, which
contains the complete coding sequence for human leukosialin, into the unique EcoRI site of plasmid
pZIPNEO (Cepko et al., Cell 37:1053-1063 (1984), which is incorporated herein by reference). Plasmid
structures were confirmed by restriction mapping and by sequencing the construction sites. pZIPNEO was
kindly provided by Dr. Channing Der.

Transfection: CHO DG44 cells were grown in 100 mm tissue culture plates. When the cells were 20%
confluent, they were co-transfected with a 1:4 molar ratio of pZIPNEO-leu and pPSVE1-PyE using the
calcium phosphate technique (Graham and van der Eb, Virology 52:456-467 (1973), which is incorporated
herein by reference). Transfected cells were isolated and maintained in medium containing 400 µg/ml
G418 (active drug).

Leukosialin expression: The total pool of G418-resistant transfectants was enriched for human leukosialin
expressing cells by a one-step panning procedure using anti-leukosialin antibodies and goat anti-rabbit IgG
coated panning dishes (Sigma) (Carlsson and Fukuda, J. Biol. Chem. 261:12779-12786 (1986), which is
incorporated herein by reference). Clonal cell lines were obtained by limiting dilution. Six clonal cell lines
expressing human leukosialin on the cell surface were identified by indirect immunofluorescence and
isolated for further studies (Williams and Fukuda, J. Cell Biol. 111:955-966 (1990), which is incorporated
herein by reference).

Polyoma virus-mediated replication: The ability of the six clonal cell lines to support polyoma virus large
T antigen-mediated replication of plasmids was assessed by determining the methylation status of
transfected plasmids containing a polyoma virus origin of replication (Muller et al., supra, 1984; Heffernan
and Dennis, supra, 1991). Plasmid pGT/hCG contains a fused β1→4 galactosyltransferase and human
chorionic gonadotropin α-chain DNA sequence inserted in plasmid pcDNAI, which contains a polyoma virus
replication origin (Aoki et al., Proc. Natl. Acad. Sci. USA 89:4319-4323 (1992), which is incorporated herein
by reference).

Plasmid pGT/hCG was isolated from methylase-positive E. coli strain MC1061/P3 (Invitrogen), which
methylates the adenine residues in the DpnI recognition site, "GATC". The methylated DpnI recognition site
is susceptible to cleavage by DpnI. In contrast, the DpnI recognition site of plasmids replicated in
mammalian cells is not methylated and, therefore, is resistant to DpnI digestion.
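
The logic of the DpnI replication assay can be illustrated with a minimal sketch. The toy sequence and the function name `dpni_fragments` are hypothetical, and the plasmid is modelled as a linear string for simplicity (an assumption; a real plasmid is circular). The cut position between A and T reflects the known blunt cleavage of DpnI within methylated GATC.

```python
import re

DPNI_SITE = "GATC"  # DpnI recognition sequence named in the text


def dpni_fragments(seq: str, dam_methylated: bool) -> list[str]:
    """Simulate a DpnI digest of a linear DNA string.

    DpnI cleaves only when the adenine in GATC is methylated, so
    unmethylated (mammalian-replicated) DNA is returned intact.
    """
    if not dam_methylated:
        return [seq]
    # Blunt cut between A and T: GA^TC, i.e. two bases into each site.
    cuts = [m.start() + 2 for m in re.finditer(DPNI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


toy_plasmid = "AAGATCTTTTGATCCC"  # hypothetical sequence with two GATC sites
input_dna = dpni_fragments(toy_plasmid, dam_methylated=True)        # bacterial prep
replicated_dna = dpni_fragments(toy_plasmid, dam_methylated=False)  # replicated in CHO
print(input_dna)
print(replicated_dna)
```

In this model, a DpnI-resistant band on the blot corresponds to the single intact fragment returned for unmethylated DNA.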

Methylated plasmid pGT/hCG was transfected by lipofection into each of the six selected clonal cell
lines expressing leukosialin. After 64 hr, low molecular weight plasmid DNA was isolated from the cells
using the method of Hirt, J. Mol. Biol. 26:365-369 (1967), which is incorporated herein by reference. Isolated
plasmid DNA was digested with XhoI and DpnI (Stratagene), subjected to electrophoresis in a 1% agarose
gel, and transferred to nylon membranes (Micron Separations Inc., MA).

A 0.4 kb SmaI fragment of the β1→4 galactosyltransferase DNA sequence of pGT/hCG was radiolabel-
ed with [32P]dCTP using the random primer method (Feinberg and Vogelstein, Anal. Biochem. 132:6-13
(1983), which is incorporated herein by reference). Hybridization was performed using methods well-known
to those skilled in the art (see, for example, Sambrook et al., supra, (1989)). Following hybridization, the
membranes were washed several times, including a final high stringency wash in 0.1 x SSPE, 0.1% SDS for
1 hr at 65°C, then exposed to Kodak X-AR film at -70°C.

Four of the six clones tested supported replication of the pcDNAI-based plasmid, pGT/hCG (Fig. 3.A.,
lanes 1, 3, 4 and 5). MOP-8 cells, a 3T3 cell line transformed by polyoma virus early genes (Muller et al.,
supra, (1984)), express endogenous core 2 β1→6 N-acetylglucosaminyltransferase activity and were used
as a control for the replication assay (Fig. 3.B., lane 1). One clonal cell line, CHO-Py-leu, that supported pGT/hCG
replication (Fig. 3.A., lane 5; Fig. 3.B., lanes 2 and 3) and expressed a significant amount of
leukosialin was selected for further studies. pGT/hCG was kindly provided by Dr. Michiko Fukuda.

EXAMPLE III

ISOLATION OF A cDNA SEQUENCE DIRECTING EXPRESSION OF THE HEXASACCHARIDE ON
LEUKOSIALIN

Poly(A)+ RNA was isolated from HL-60 promyelocytes, which contain a significant amount of the core 2
β1→6 N-acetylglucosaminyltransferase (Saitoh et al., supra, (1991)). A cDNA expression library, pcDNAI-HL-
60, was prepared (Invitrogen) and the library was screened for clones directing the expression of the T305
antigen.

Plasmid DNA from the pcDNAI-HL-60 cDNA library was transfected into CHO-Py-leu cells using a
modification of the lipofection procedure, described below (Felgner et al., Proc. Natl. Acad. Sci. USA
84:7413-7417 (1987), which is incorporated herein by reference). CHO-Py-leu cells were grown in 100 mm
tissue culture plates. When the cells were 20% confluent, they were washed twice with Opti-MEM I
(GIBCO). Fifty µg of Lipofectin reagent (Bethesda Research Laboratories) and 20 µg of purified plasmid
DNA were each diluted to 1.5 ml with Opti-MEM I, then mixed and added to the cells. After incubation for 6
hr at 37°C, the medium was removed, 10 ml of complete medium was added and incubation was continued
for 16 hr at 37°C. The medium was then replaced with 10 ml of fresh medium.

Following a 64 hr period to allow transient expression of the transfected plasmids, the cells were
detached in PBS/5 mM EDTA, pH 7.4, for 30 min at 37°C, pooled, centrifuged and resuspended in cold
PBS/10 mM EDTA/5% fetal calf serum, pH 7.4, containing a 1:200 dilution of ascites fluid containing T305
monoclonal antibody. The cells were incubated on ice for 1 hr, then washed in the same buffer and panned
on dishes coated with goat anti-mouse IgG (Sigma) (Wysocki and Sato, Proc. Natl. Acad. Sci. USA 75:2844-
2848 (1978); Seed and Aruffo, Proc. Natl. Acad. Sci. USA 84:3365-3369 (1987), which are incorporated herein
by reference). T305 monoclonal antibody was kindly provided by Dr. R.I. Fox, Scripps Research Founda-
tion, La Jolla, CA.

Plasmid DNA was recovered from adherent cells by the method of Hirt, supra, (1967), treated with DpnI
to eliminate plasmids that had not replicated in transfected cells, and transformed into E. coli strain
MC1061/P3. Plasmid DNA was then recovered and subjected to a second round of screening. E. coli
transformants containing plasmids recovered from this second enrichment were plated to yield 8 pools of
approximately 500 colonies each. Replica plates were prepared using methods well-known to those skilled
in the art (see, for example, Sambrook et al., supra, (1989)).

The pooled plasmid DNA was prepared from replica plates and transfected into CHO-Py-leu cells. The
transfectants were screened by panning. One plasmid pool was selected and subjected to three subsequent
rounds of selection. One plasmid, pcDNAI-C2GnT, which directed the expression of the T305 antigen, was
isolated. CHO-Py-leu cells transfected with pcDNAI-C2GnT express the antigen recognized by T305,
whereas CHO-Py-leu cells transfected with pcDNAI are negative for T305 antigen (Fig. 4). These results
show that pcDNAI-C2GnT directs the expression of a new determinant on leukosialin that is recognized by T305
monoclonal antibody. This determinant is the branched hexasaccharide sequence, NeuNAcα2→3Galβ1→3-
(NeuNAcα2→3Galβ1→4GlcNAcβ1→6)GalNAc.

EXAMPLE IV 

CHARACTERIZATION OF C2GnT

DNA sequence: The cDNA insert in plasmid pcDNAI-C2GnT was sequenced by the dideoxy chain
termination method using Sequenase version 2 reagents (United States Biochemicals) (Sanger et al., Proc.
Natl. Acad. Sci. USA 74:5463-5467 (1977), which is incorporated herein by reference). Both strands were
sequenced using 17-mer synthetic oligonucleotides, which were synthesized as the sequence of the cDNA
insert became known.

Plasmid pcDNAI-C2GnT contains a 2105 base pair insert (Fig. 5). The cDNA sequence (SEQ. ID. NO. 3)
ends 1878 bp downstream of the putative translation start site. A polyadenylation signal is present at
nucleotides 1694-1699. The significance of the large number of nucleotides between the polyadenylation
signal and the beginning of the polyadenyl chain is not clear. However, this sequence is A/T rich.

Deduced amino acid sequence: The cDNA insert in plasmid pcDNAI-C2GnT encodes a single open
reading frame in the sense orientation with respect to the pcDNAI promoter (Fig. 5). The open reading
frame encodes a putative 428 amino acid protein having a molecular mass of 49,790 daltons.
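
As an illustration of how an amino acid sequence is deduced from an open reading frame, the following minimal sketch translates DNA with the standard genetic code. The function name `translate` is hypothetical; the example input is the first five codons of the coding region as they appear in the SEQ ID NO: 3 listing.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence (one-letter amino acids), stopping at a stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the open reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# First five codons of the C2GnT coding region (SEQ ID NO: 3, from nucleotide 220).
print(translate("ATG CTG AGG ACG TTG"))
```

The same function applied to a full open reading frame returns the complete deduced protein sequence up to the first in-frame stop codon.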

Hydropathy analysis indicates the predicted protein is a type II transmembrane molecule, as are all
previously reported mammalian glycosyltransferases (Schachter, supra, (1991)). In this topology, a nine
amino acid cytoplasmic NH2-terminal segment is followed by a 23 amino acid transmembrane domain
flanked by basic amino acid residues. The large COOH-terminus consists of the stem and catalytic domains
and presumably faces the lumen of the Golgi complex.
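
A hydropathy analysis of this kind can be sketched as follows, using the Kyte-Doolittle scale. This is an assumption: the text does not state which scale or window was used. The peptide is the first 37 residues transcribed from the SEQ ID NO: 3 listing, with damaged residue names read as the nearest standard amino acid (also an assumption), and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def sliding_hydropathy(peptide: str, window: int) -> list[float]:
    """Mean hydropathy of every window of the given length along the peptide."""
    return [mean_hydropathy(peptide[i:i + window])
            for i in range(len(peptide) - window + 1)]


n_terminus = "MLRTLLRRRLFSYPTKYYFMVLVLSLITFSVLRIHQK"
cytoplasmic = n_terminus[:9]     # nine-residue NH2-terminal segment
membrane = n_terminus[9:32]      # 23-residue putative transmembrane domain

print(round(mean_hydropathy(cytoplasmic), 2), round(mean_hydropathy(membrane), 2))
print([round(v, 2) for v in sliding_hydropathy(n_terminus, 11)])
```

A markedly higher mean for the 23-residue stretch than for the flanking segments is the pattern expected for a single membrane-spanning signal-anchor domain.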

The putative protein contains three potential N-glycosylation sites (Fig. 5, asterisks). However, one of
these sites contains a proline residue adjacent to asparagine and is not likely utilized in vivo.
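
The search for potential N-glycosylation sites can be sketched with the consensus sequon Asn-X-Ser/Thr, where a proline at X is generally taken to prevent use of the site. The peptide below is a hypothetical test string, not the C2GnT sequence, and `find_sequons` is an illustrative name.

```python
import re


def find_sequons(protein: str, allow_proline: bool = False) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons.

    With allow_proline=False, sites with proline at the X position are
    excluded, since such sites are unlikely to be glycosylated.
    """
    pattern = r"(?=N.[ST])" if allow_proline else r"(?=N[^P][ST])"
    # The lookahead lets overlapping sequons be reported as well.
    return [m.start() + 1 for m in re.finditer(pattern, protein)]


toy_peptide = "MNGSANPTLNKT"  # hypothetical: N-G-S, N-P-T and N-K-T motifs
all_sites = find_sequons(toy_peptide, allow_proline=True)
likely_sites = find_sequons(toy_peptide)
print(all_sites, likely_sites)
```

Comparing the two result lists separates the potential sites from those likely to be used, mirroring the distinction drawn in the text.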

No matches were obtained when the C2GnT cDNA sequence and deduced amino acid sequence were
compared with sequences listed in the PC/Gene 6.6 data bank. In particular, no homology was revealed
between the deduced amino acid sequence of C2GnT and other glycosyltransferases, including N-
acetylglucosaminyltransferase I (Sarkar et al., Proc. Natl. Acad. Sci. USA 88:234-238 (1991), which is
incorporated herein by reference).

mRNA expression: Poly(A)+ RNA was prepared using a kit (Stratagene), resolved by electrophoresis
on a 1.2% agarose/2.2 M formaldehyde gel, and transferred to nylon membranes (Micron Separations Inc.,
MA) using methods well-known to those skilled in the art (see, for example, Sambrook et al., supra, (1989)).
Membranes were probed using the EcoRI insert of pPROTA-C2GnT (see below) radiolabeled with [32P]-
dCTP by the random priming method (Feinberg and Vogelstein, supra, (1983)). Hybridization was performed
in buffers containing 50% formamide for 24 hr at 42°C (Sambrook et al., supra, (1989)). Following
hybridization, filters were washed several times in 1 x SSPE/0.1% SDS at room temperature and once in
0.1 x SSPE/0.1% SDS at 42°C, then exposed to Kodak X-AR film at -70°C.

Fig. 6 compares the level of core 2 β1→6 N-acetylglucosaminyltransferase mRNA isolated from HL-60
promyelocytes, K562 erythroleukemia cells, and poorly metastatic SP and highly metastatic L4 colonic
carcinoma cells. The major RNA species migrates at a size essentially identical to the ~2.1 kb C2GnT
cDNA sequence. The same result is observed for HL-60 cells and the two colonic cell lines, which
apparently synthesize the hexasaccharides. In addition, two transcripts of ~3.3 kb and 5.4 kb in size were
detected in these cell lines. The two larger transcripts may result from differential usage of polyadenylation
signals.

No hybridization occurred with poly(A)+ RNA isolated from K562 cells, which lack the hexasaccharide
but synthesize the tetrasaccharide (Carlsson et al., supra, (1986), which is incorporated herein by
reference). Similarly, no hybridization was observed for poly(A)+ RNA isolated from CHO-Py-leu cells (Fig. 6,
lane 1).


EP 0 590 747 A2 



EXAMPLE V 

EXPRESSION OF ENZYMATICALLY ACTIVE β1→6 N-ACETYLGLUCOSAMINYLTRANSFERASE

In order to confirm that C2GnT cDNA encodes core 2 β1→6 N-acetylglucosaminyltransferase,
enzymatic activity was examined in CHO-Py-leu cells transfected with pcDNAI or pcDNAI-C2GnT. Following
a 64 hr period to allow transient expression, cell lysates were prepared and core 2 β1→6 N-acetyl-
glucosaminyltransferase activity was measured.

N-acetylglucosaminyltransferase assays were performed essentially as described by Saitoh et al.,
supra, (1991), Yousefi et al., supra, (1991), and Lee et al., J. Biol. Chem. 265:20476-20487 (1990), which is
incorporated herein by reference. Each reaction contained 50 mM MES, pH 7.0, 0.5 µCi of UDP-[3H]GlcNAc
in 1 mM UDP-GlcNAc, 0.1 M GlcNAc, 10 mM Na2EDTA, 1 mM of acceptor and 25 µl of either cell lysate,
cell supernatant or IgG-Sepharose matrix in a total reaction volume of 50 µl.

Reactions were incubated for 1 hr at 37°C, then processed by C18 Sep-Pak chromatography (Waters)
(Palcic et al., J. Biol. Chem. 265:6759-6769 (1990), which is incorporated herein by reference). Core 2 and
core 4 β1→6 N-acetylglucosaminyltransferase were assayed using the acceptors p-nitrophenyl Galβ1→3Gal-
NAc and p-nitrophenyl GlcNAcβ1→3GalNAc, respectively (Toronto Research Chemicals).

UDP-GlcNAc:α-Man β1→6 N-acetylglucosaminyltransferase(V) was assayed using the acceptor
GlcNAcβ1→2Manα1→6Glc-β-O-(CH2)7CH3. The blood group I enzyme, UDP-
GlcNAc:GlcNAcβ1→3Galβ1→4GlcNAc (GlcNAc to Gal) β1→6 N-acetylglucosaminyltransferase, was assayed
using GlcNAcβ1→3Galβ1→4GlcNAcβ1→6Manα1→6Manβ1→O-(CH2)8COOCH3 or
Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→O-(CH2)7CH3 as acceptors (Gu et al., J.
Biol. Chem. 267:2994-2999 (1992), which is incorporated herein by reference). Synthetic acceptors were
kindly provided by Dr. Ole Hindsgaul, University of Alberta, Canada.
Results of these assays are shown in Table I. Assuming transfection efficiency of the cells is
approximately 20-30%, the level of enzymatic activity directed by cells transfected with pcDNAI-C2GnT is
roughly equivalent to the level observed in HL-60 cells.

In order to unequivocally establish that the C2GnT cDNA sequence encodes core 2 β1→6 N-acetyl-
glucosaminyltransferase, plasmid pPROTA-C2GnT was constructed containing the DNA sequence encod-
ing the putative catalytic domain of core 2 β1→6 N-acetylglucosaminyltransferase fused in frame with the
signal peptide and IgG binding domain of S. aureus protein A (Fig. 7). The putative catalytic domain is
contained in a 1330 bp fragment of the C2GnT cDNA that encodes amino acid residues 38 to 428. Plasmid
pPROTA was kindly provided by Dr. John B. Lowe.

The polymerase chain reaction (PCR) was used to insert EcoRI recognition sites on either side of the
1330 bp sequence in pcDNAI-C2GnT DNA. PCR was performed using the synthetic oligonucleotide primers
5'-TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC-3' (SEQ. ID. NO. 5) and 5'-
TTTGAATTCGCAGAAACCATGCAGGTTCTGTGA-3' (SEQ. ID. NO. 6) (EcoRI recognition sites underlined).
The EcoRI sites allowed direct in-frame insertion of the fragment into the unique EcoRI site of plasmid
pPROTA (Sanchez-Lopez et al., J. Biol. Chem. 263:11892-11899 (1988), which is incorporated herein by
reference).
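
The position of the EcoRI recognition site (GAATTC) within each primer can be checked with a short sketch. The primer strings below are transcriptions of SEQ. ID. NO. 5 and 6 in which ambiguous characters were read as G, which is an assumption; the dictionary keys and the function name are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence, cut as G^AATTC

# Primer sequences as transcribed from the text (ambiguous characters read as G).
primers = {
    "SEQ_ID_NO_5": "TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC",
    "SEQ_ID_NO_6": "TTTGAATTCGCAGAAACCATGCAGGTTCTGTGA",
}


def site_positions(primer: str, site: str = ECORI_SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of a restriction site."""
    positions = []
    start = primer.find(site)
    while start != -1:
        positions.append(start)
        start = primer.find(site, start + 1)
    return positions


for name, seq in primers.items():
    print(name, len(seq), site_positions(seq))
```

In both transcribed primers the site follows a three-base 5' extension, and the remainder of each primer anneals to the C2GnT template.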

The nucleotide sequence of the insert as well as the proper orientation were confirmed by DNA
sequencing using the primers described above for cDNA sequencing. Plasmid pPROTA-C2GnT allows
secretion of the fusion protein from transfected cells and binding of the secreted fusion protein by
insolubilized immunoglobulins.

Either pPROTA or pPROTA-C2GnT was transfected into COS-1 cells. Following a 64 hr period to allow
transient expression, cell supernatants were collected (Kukowska-Latallo et al., supra, (1990)). Cell super-
natants were cleared by centrifugation, adjusted to 0.05% Tween 20 and either assayed directly for core 2
β1→6 N-acetylglucosaminyltransferase activity or used in IgG-Sepharose (Pharmacia) binding studies. For
the latter assay, supernatants (10 ml) were incubated batchwise with approximately 300 µl of IgG-
Sepharose for 4 hr at 4°C. The matrices were then extensively washed and used directly for glycosyltrans-
ferase assays.

No core 2 β1→6 N-acetylglucosaminyltransferase activity was detected in the medium of COS-1 cells
transfected with the control plasmid, pPROTA. Similarly, no enzymatic activity was associated with IgG-
Sepharose beads. In contrast, a significant level of core 2 β1→6 N-acetylglucosaminyltransferase activity
was detected in the medium of COS-1 cells transfected with pPROTA-C2GnT. The activity also associated
with the IgG-Sepharose beads (Table II). No activity was detected in the supernatant following incubation of
the supernatant with IgG-Sepharose.

TABLE II

Determination of Enzymatic Activities Directed by pPROTA-C2GnT

                                                          Radioactivity (cpm)
Acceptor                  Linkage formed   Enzyme         (-) acceptor   (+) acceptor

Galβ1→3GalNAc             GlcNAcβ1→6       core 2-GnT          109           1048
GlcNAcβ1→3GalNAc          GlcNAcβ1→6       core 4-GnT          111            113
GlcNAcβ1→2Man             GlcNAcβ1→6       GnTV                118            111
GlcNAcβ1→3Gal             GlcNAcβ1→6       I-GnT               115            113
Galβ1→4GlcNAcβ1→3Gal      GlcNAcβ1→6       I-GnT                99             96

COS-1 cells were transfected with pPROTA-C2GnT and the
conditioned media were incubated with IgG-Sepharose. The
proteins bound to the IgG-Sepharose were assayed for β1→6
N-acetylglucosaminyltransferase activity by using
appropriate acceptors. The linkage formed is listed beside
each acceptor. Similar results were obtained in three
independent experiments.
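
The comparison in Table II amounts to subtracting the no-acceptor background from the counts obtained with acceptor. A minimal sketch using the core 2 and core 4 rows follows; the variable and function names are hypothetical, and pairing the lower count with the no-acceptor control is an assumption drawn from the table header.

```python
# cpm values from Table II: (without acceptor, with acceptor).
table_ii = {
    "core 2-GnT": (109, 1048),
    "core 4-GnT": (111, 113),
}


def net_incorporation(minus_acceptor: int, plus_acceptor: int) -> int:
    """Acceptor-dependent radioactivity: counts with acceptor minus background."""
    return plus_acceptor - minus_acceptor


net_cpm = {enzyme: net_incorporation(*counts) for enzyme, counts in table_ii.items()}
fold_over_background = {
    enzyme: plus / minus for enzyme, (minus, plus) in table_ii.items()
}

for enzyme in table_ii:
    print(enzyme, net_cpm[enzyme], round(fold_over_background[enzyme], 1))
```

Only the core 2 acceptor gives counts well above background, which is the basis for the specificity conclusion drawn from the table.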
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EXAMPLE VI 

DETERMINATION OF C2GnT SPECIFICITY

Four types of β1→6 N-acetylglucosaminyltransferase linkages have been reported, including core 2 and
core 4 in O-glycans, I-antigen and a branch attached to mannose that forms tetraantennary N-glycans (see
Table II). In order to determine whether these different structures are also synthesized by the enzyme encoded
by the cloned C2GnT cDNA sequence, enzymatic activity was determined using five different acceptors.

As shown in Table II, the fusion protein was only active with the acceptor for core 2 formation. The
same was true when the formation of a β1→6 N-acetylglucosaminyl linkage to internal galactose residues was
examined (Table II, see structure at bottom). This result precludes the likelihood that the enzyme encoded
by the C2GnT cDNA sequence may add N-acetylglucosamine to a non-reducing terminal galactose. The
HL-60 core 2 β1→6 N-acetylglucosaminyltransferase is exclusively responsible for the formation of the
GlcNAcβ1→6 branch on Galβ1→3GalNAc.

Although the invention has been described with reference to the disclosed embodiments, it should be
understood that various modifications can be made without departing from the spirit of the invention.
Accordingly, the invention is limited only by the following claims.
Lowe et al., Cell 63:475-484 (1990)
Brandley et al., Cell 63:861-863 (1990)
Phillips et al., Science 250:1130-1132 (1990)
Walz et al., Science 250:1132-1135 (1990)
Higgins et al., J. Biol. Chem. 266:6280-6290 (1991)
Schachter, Biochem. Cell Biol. 64:163-181 (1986)
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT:
    (A) NAME: La Jolla Cancer Research Foundation
    (B) STREET: 10901 North Torrey Pines Road
    (C) CITY: La Jolla
    (D) STATE: California
    (E) COUNTRY: U.S.A.
    (F) POSTAL CODE (ZIP): 92037

(ii) TITLE OF INVENTION: A NOVEL BETA1-6
    N-ACETYLGLUCOSAMINYLTRANSFERASE, ITS ACCEPTOR MOLECULE
    LEUKOSIALIN AND A METHOD FOR CLONING PROTEINS HAVING
    ENZYMATIC ACTIVITY

(iii) NUMBER OF SEQUENCES: 8

(iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 900 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: both
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 841..900

(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 91..192
    (D) OTHER INFORMATION: /note= "EXON 1' IS LOCATED IN BOTH
        GENOMIC AND cDNA. IN THE cDNA EXON 1' IS
        IMMEDIATELY FOLLOWED BY EXON 2."

(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 339..428
    (D) OTHER INFORMATION: /note= "EXON 1 IS LOCATED IN
        GENOMIC DNA"

(ix) FEATURE:
    (A) NAME/KEY: intron
    (B) LOCATION: 193..806
    (D) OTHER INFORMATION: /note= "THIS SEGMENT OF NUCLEIC
        ACID CONSTITUTES INTRON SEQUENCE OF THE cDNA"

(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 807..900
    (D) OTHER INFORMATION: /note= "EXON 2 IS LOCATED IN BOTH
        GENOMIC AND cDNA. IN THE cDNA EXON 2 IMMEDIATELY
        FOLLOWS EXON 1'."

(jci) SEQUENCE DESCRIPTION : SEQ ID NO-l: 



15 



25 



30 



TTGGGGACCA CAAATGCAAA GGAAACCACC CTCCCCTCCC ACCTCCTCCT CTGGACCCTT 60

GAGTTCTCAO GCTCACATTC CCACCACCCA CCTCTGAGCC CAGCGCTCCC TAGCATCACC 120

ACTTCCATCC CATTCCTCAG CCAAGAGCCA GGAATCCTGA TTCCAGATCC CACGCTTCCC 180

TGCCTCCCTC AGGTGAGCCC CAGACCCCCA GGCAGCCCGC TGGCCCCTGA AGGAGCAGGT 240

GATGGTGCTG TCTTCGCCCA GCAGCTGTGG GAGCAGGCGG GTGGGGCAGG ATGGAGGGGT 300

GGGTGGCGTC GGTGGAGCCA GGCCCCACTT CCTTTCCCCT TGGGGCCCTG TCCTTCCCAG 360

TCTTGCCCCA GCCTCGGGAG GTGGTGGAGT GACCTGGCCC CAGTGCTGCG TCCTTATCAG 420

CCGAGCCGCT AAGAGGGTGA GACTTGGTGG GGTAGGGGCC TCAGTGGGCC TfcGGAATGTG 480

CCTGTGGCTT GAAAAGACTC TGACAGGTTA TGATGGGAAG AGATTGGGAG CCATTGGGCT 540

GCACAGGGTC AGGGAAGGCC AGGAGGGGCT GGTCACTGCT GGAATCTAAG CTGCTGAGGC 600

TGGACGGAGC CTCAGGATGG GGCTGATGGG GGAGCTGCCA GCATCTGTTC CTCTGTCATT 660

TCTGATAACA GTAAAAGCCA GCATGGAAAA AACCGTTAAA CCGCAGGTTG GGCCTGGCCG 720

TTGGCAGGGA AGTGGGCAGA GGGGAGGCCC GGCCAGGTCC TCCGGCAACT CCCGCCTGTT 780

CTGCTTCTCC GGCTGCCCAC CTGCAGGTCC CAGCTCTTGC TCCTGCCTGT TTGCCTGGAA 840


ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC 888
Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

GCT CTG GGG AGC 900
Ala Leu Gly Ser
20



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

Ala Leu Gly Ser
20
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2105 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 220..1504

(ix) FEATURE:
(A) NAME/KEY: polyA_signal

(B) LOCATION: 1913..1918

(ix) FEATURE:

(A) NAME/KEY: misc_signal
(B) LOCATION: 248..314
(D) OTHER INFORMATION: /standard_name=
"SIGNAL/MEMBRANE-ANCHORING DOMAIN"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GTGAAGTGCT CACAATGGGG CAGGATGTCA CCTGGAATCA GCACTAAGTG ATTCAGACTT 60 

TCCTTACTTT TAAATGTGCT GCTCTTCATT TCAAOATGCC GTTGCAGCTC TGATAAATCC 120 

AAACTGACAA CCTTCAAGGC CACGACGGAG GGAAAATCAT TGGTGCTTGG AGCATAGAAG 180 

ACTGCCCTTC ACAAAGGAAA TCCCTGATTA TTGTTTCAA ATG CTG AGG ACG TTG 234 

Met Leu Arg Thr Leu 
1 5 

CTG CGA AGG AGA CTT TTT TCT TAT CCC ACC AAA TAC TAC TTT ATG GTT 282
Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys Tyr Tyr Phe Met Val
10 15 20

CTT GTT TTA TCC CTA ATC ACC TTC TCC GTT TTA AGG ATT CAT CAA AAG 330
Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu Arg Ile His Gln Lys
25 30 35

CCT GAA TTT GTA AGT GTC AGA CAC TTG GAG CTT GCT GGG GAG AAT CCT 378
Pro Glu Phe Val Ser Val Arg His Leu Glu Leu Ala Gly Glu Asn Pro
40 45 50

AGT AGT GAT ATT AAT TGC ACC AAA GTT TTA CAG GGT GAT GTA AAT GAA 426
Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln Gly Asp Val Asn Glu
55 60 65

ATC CAA AAG GTA AAG CTT GAG ATC CTA ACA GTG AAA TTT AAA AAG CGC 474
Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val Lys Phe Lys Lys Arg
70 75 80 85

CCT CGG TGG ACA CCT GAC GAC TAT ATA AAC ATG ACC AGT GAC TGT TCT 522
Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met Thr Ser Asp Cys Ser
90 95 100

TCT TTC ATC AAG AGA CGC AAA TAT ATT GTA GAA CCC CTT AGT AAA GAA 570
Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu Pro Leu Ser Lys Glu
105 110 115

GAG GCG GAG TTT CCA ATA GCA TAT TCT ATA GTG GTT CAT CAC AAG ATT 618
Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val Val His His Lys Ile
120 125 130

GAA ATG CTT GAC AGG CTG CTG AGG GCC ATC TAT ATG CCT CAG AAT TTC 666
Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr Met Pro Gln Asn Phe
135 140 145

TAT TGC GTT CAT GTG GAC ACA AAA TCC GAG GAT TCC TAT TTA GCT GCA 714
Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp Ser Tyr Leu Ala Ala
150 155 160 165



GTG ATG GGC ATC GCT TCC TGT TTT AGT AAT GTC TTT GTG GCC AGC CGA 762
Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val Phe Val Ala Ser Arg
170 175 180

TTG GAG AGT GTG GTT TAT GCA TCG TGG AGC CGG GTT CAG GCT GAC CTC 810
Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg Val Gln Ala Asp Leu
185 190 195

AAC TGC ATG AAG GAT CTC TAT GCA ATG AGT GCA AAC TGG AAG TAC TTG 858
Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala Asn Trp Lys Tyr Leu
200 205 210

ATA AAT CTT TGT GGT ATG GAT TTT CCC ATT AAA ACC AAC CTA GAA ATT 906
Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys Thr Asn Leu Glu Ile
215 220 225

GTC AGG AAG CTC AAG TTG TTA ATG GGA GAA AAC AAC CTG GAA ACG GAG 954
Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn Asn Leu Glu Thr Glu
230 235 240 245

AGG ATG CCA TCC CAT AAA GAA GAA AGG TGG AAG AAG CGG TAT GAG GTC 1002
Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys Lys Arg Tyr Glu Val
250 255 260

GTT AAT GGA AAG CTG ACA AAC ACA GGG ACT GTC AAA ATG CTT CCT CCA 1050
Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val Lys Met Leu Pro Pro
265 270 275



CTC GAA ACA CCT CTC TTT TCT GGC AGT GCC TAC TTC GTG GTC AGT AGG 1098
Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr Phe Val Val Ser Arg
280 285 290



GAG TAT GTG GGC TAT GTA CTA CAG AAT GAA AAA ATC CAA AAG TTG ATG 1146
Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys Ile Gln Lys Leu Met
295 300 305

GAG TGG GCA CAA GAC ACA TAC AGC CCT GAT GAG TAT CTC TGG GCC ACC 1194
Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu Tyr Leu Trp Ala Thr
310 315 320 325

ATC CAA AGG ATT CCT GAA GTC CCG GGC TCA CTC CCT GCC AGC CAT AAG 1242
Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu Pro Ala Ser His Lys
330 335 340

TAT GAT CTA TCT GAC ATG CAA GCA GTT GCC AGG TTT GTC AAG TGG CAG 1290
Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg Phe Val Lys Trp Gln
345 350 355
TAC TTT GAG GGT GAT GTT TCC AAG GGT GCT CCC TAC CCG CCC TGC GAT 1338
Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro Tyr Pro Pro Cys Asp
360 365 370

GGA GTC CAT GTG CGC TCA GTG TGC ATT TTC GGA GCT GGT GAC TTG AAC 1386
Gly Val His Val Arg Ser Val Cys Ile Phe Gly Ala Gly Asp Leu Asn
375 380 385



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 428 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Leu Arg Thr Leu Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys
1 5 10 15

Tyr Tyr Phe Met Val Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu
20 25 30

Arg Ile His Gln Lys Pro Glu Phe Val Ser Val Arg His Leu Glu Leu
35 40 45

Ala Gly Glu Asn Pro Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln
50 55 60



TGG ATG CTG CGC AAA CAC CAC TTG TTT GCC AAT AAG TTT GAC GTG GAT 1434
Trp Met Leu Arg Lys His His Leu Phe Ala Asn Lys Phe Asp Val Asp
390 395 400 405

GTT GAC CTC TTT GCC ATC CAG TGT TTG GAT GAG CAT TTG AGA CAC AAA 1482
Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu His Leu Arg His Lys
410 415 420

GCT TTG GAG ACA TTA AAA CAC T GACCATTACG GGCAATTTTA TGAACAAGAA 1534
Ala Leu Glu Thr Leu Lys His
425

GAAGGATACA CAAAACGTAC CTTATCTGTT TCCCCTTCCT TGTCAGCGTC GGGAAGATCG 1594

TAT v AAG TCC TCTTTGGGGC AGGGACTCTA GTAGATCTTC TTGTCAGAGA AGCTGCATGG 1654 

TTTCTGCAGA GCACAGTTAG CTAGAAAGGT GATAGCATTA AATGTTCATC TAGAGTTAAT 1714 

AGTGGGAGGA GTAAAGGTAG CCTTGAGGCC AGAGCAGGTA GCAAGGCATT GTGGAAAGAG 1774 

GGGACCAGCG TGGCTCGGGA AGAGGCCGAT GCATAAAGTC AGCCTGTTCC AAGTGCTCAG 1834

GCACTTAGCA AAATGAGAAG ATCTGACCTG TGCCAAAACT ATTTTCAGAA TTTTAAATGT 1894 

GACCATTTTT CTGGTATCAA TAAACTTACA GCAACAAATA ATCAAAGATA CAATTAATCT 1954 

GATATTATAT TTGTTGAAAT AGAAATTTGA TTGTACTATA AATGATTTTT GTAAATAATT 2014 

TATATTCTGC TCTAATACTG TACTGTGTAG TGTCTCTCCG TATGTCATCT CAGGGAGCTT 2074
AAAATGGGCT TGATTTAACA TTGAAAAAAA A
Gly Asp Val Asn Glu Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val
65 70 75 80

Lys Phe Lys Lys Arg Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met
85 90 95

Thr Ser Asp Cys Ser Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu
100 105 110

Pro Leu Ser Lys Glu Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val
115 120 125

Val His His Lys Ile Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr
130 135 140

Met Pro Gln Asn Phe Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp
145 150 155 160

Ser Tyr Leu Ala Ala Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val
165 170 175



Phe Val Ala Ser Arg Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg
180 185 190

Val Gln Ala Asp Leu Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala
195 200 205

Asn Trp Lys Tyr Leu Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys
210 215 220

Thr Asn Leu Glu Ile Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn
225 230 235 240

Asn Leu Glu Thr Glu Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys
245 250 255

Lys Arg Tyr Glu Val Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val
260 265 270

Lys Met Leu Pro Pro Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr
275 280 285

Phe Val Val Ser Arg Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys
290 295 300

Ile Gln Lys Leu Met Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu
305 310 315 320

Tyr Leu Trp Ala Thr Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu
325 330 335

Pro Ala Ser His Lys Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg
340 345 350

Phe Val Lys Trp Gln Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro
355 360 365

Tyr Pro Pro Cys Asp Gly Val His Val Arg Ser Val Cys Ile Phe Gly
370 375 380

Ala Gly Asp Leu Asn Trp Met Leu Arg Lys His His Leu Phe Ala Asn
385 390 395 400






EP 0 590 747 A2 



Lys Phe Asp Val Asp Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu
405 410 415

His Leu Arg His Lys Ala Leu Glu Thr Leu Lys His
420 425



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TTTGAATTCC CCTGAATTTC TAAGTGTCAG ACAC 

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA 
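
As a quick check on the two oligonucleotides above, the following minimal sketch compares each sequence with its declared length and locates the hexamer GAATTC, which is the recognition sequence of the restriction enzyme EcoRI. Whether the site was intended for EcoRI cloning is an assumption here, and the names `PRIMERS`, `ECORI_SITE` and `check_primer` are hypothetical.

```python
# Oligonucleotides as printed in the listing, with their declared lengths.
PRIMERS = {
    "SEQ ID NO:5": ("TTTGAATTCC CCTGAATTTC TAAGTGTCAG ACAC", 34),
    "SEQ ID NO:6": ("TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA", 33),
}

# GAATTC is the EcoRI recognition sequence; its intended use in these
# primers is an assumption made for illustration.
ECORI_SITE = "GAATTC"


def check_primer(printed_sequence, declared_length):
    """Compare the sequence with its declared length and find the site."""
    sequence = printed_sequence.replace(" ", "").upper()
    return {
        "length": len(sequence),
        "length_matches": len(sequence) == declared_length,
        # 0-based offset of the first GAATTC, or -1 when absent.
        "site_offset": sequence.find(ECORI_SITE),
    }


if __name__ == "__main__":
    for name, (sequence, declared) in PRIMERS.items():
        print(name, check_primer(sequence, declared))
```

Both sequences match their declared lengths, and in both the hexamer follows three leading T residues.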

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..15

(D) OTHER INFORMATION: /note= "PROTEIN A - C2GNT FUSION
PROTEIN"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GGG AAT TCC CCT GAA 
Gly Asn Ser Pro Glu 
1 5 



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Gly Asn Ser Pro Glu 
1 5 
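
The correspondence between the 15-base coding sequence of SEQ ID NO:7 and the peptide of SEQ ID NO:8 can be reproduced with the standard genetic code. The following is a minimal sketch, not part of the disclosure; the names `CODON_TABLE`, `translate` and `to_three_letter` are hypothetical helpers.

```python
# Standard genetic code with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) in frame 1, one letter per codon."""
    sequence = dna.replace(" ", "").upper()
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3))


def to_three_letter(peptide):
    """Render a one-letter peptide in the three-letter style of the listing."""
    return " ".join(THREE_LETTER[residue] for residue in peptide)


SEQ_ID_NO_7 = "GGG AAT TCC CCT GAA"
SEQ_ID_NO_8 = "Gly Asn Ser Pro Glu"

if __name__ == "__main__":
    print(to_three_letter(translate(SEQ_ID_NO_7)))
```

Translating the five codons of SEQ ID NO:7 in this way gives the same five residues listed for SEQ ID NO:8.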




Claims 

1. A purified human protein or an active fragment thereof having β1→6 N-acetylglucosaminyltransferase
activity.

2. The purified protein of claim 1, wherein said activity is that of UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to
GalNAc) β1→6 N-acetylglucosaminyltransferase.

3. The purified protein of claim 2, wherein said protein has a relative molecular weight of about 50 kD.

4. An isolated nucleic acid encoding the human protein or active fragment thereof of claim 1.

5. A vector containing the nucleic acid of claim 4.

6. The vector of claim 5, wherein said vector is a plasmid.

7. The vector of claim 5, wherein said vector is pcDNAI-C2GnT.

8. A host cell containing the vector of claim 5.

9. A purified human protein or a fragment thereof that is an acceptor molecule, said acceptor molecule
being acted upon by the protein of claim 2 having activity which exclusively forms core 2 oligosaccharide
structures in O-glycans.

10. The acceptor molecule of claim 9, wherein said acceptor molecule is leukosialin, CD43.

11. An isolated nucleic acid encoding the acceptor molecule of claim 9.

12. A vector containing the nucleic acid of claim 11.

13. The vector of claim 12, wherein said vector is a plasmid.

14. The vector of claim 12, wherein said vector is pcDSRα-leu.

15. A host cell containing the vector of claim 12.

16. A method of obtaining from a cell line, which does not normally contain a protein having catalytic
activity or an acceptor molecule for said protein, a nucleic acid encoding said protein having catalytic
activity comprising:

a. transfecting said cell line with a DNA sequence encoding the acceptor molecule, wherein the
acceptor molecule is stably expressed in the cell line;

b. transfecting said cell line with a cDNA library containing said nucleic acid in a vector, wherein
proteins encoded by the transfected cDNA are transiently expressed;

c. screening the transfected cells for expression of said protein having catalytic activity; and

d. isolating the nucleic acid encoding the protein having catalytic activity.

17. The vector of claim 16, wherein said vector replicates in the transfected cell line.

18. The vector of claim 17, wherein said vector is a plasmid.

19. The vector of claim 16, wherein said vector contains a viral replication origin.

20. The vector of claim 19, wherein said replication origin is the polyoma virus replication origin.

21. The cell line of claim 16, wherein said cell line supports replication of a vector.

22. The cell line of claim 16, wherein said cell line expresses polyoma virus large T antigen.

23. The cell line of claim 16, wherein said cell line is the Chinese hamster ovary cell line.

24. The cell line of claim 23, wherein said cell line is CHO-Py-leu.

25. A method of isolating a polypeptide having catalytic activity that forms core 2 oligosaccharide
structures in O-glycans, said method comprising growing the host cell of claim 8 under conditions
which favor expression of a nucleic acid encoding said polypeptide, and isolating said polypeptide so
produced.

FIG. 3A

FIG. 3B